PRAGMATEXT: Annotating the C-ORAL-ROM Corpus with Pragmatic Knowledge

Author

  • Ana González-Ledesma
Abstract

This paper outlines the first phase of the PRAGMATEXT project. The aim of PRAGMATEXT is to introduce pragmatic knowledge into the transcriptions of C-ORAL-ROM, a spontaneous spoken corpus of Spanish. The paper is divided into four sections. The first section presents the most relevant features of the C-ORAL-ROM corpus. The second describes the pragmatic-discursive annotation model. The phenomena tagged are: emotional discourse, argumentative operations, modalization operations, evidentiality, phraseological units with metaphoric meaning, and speech acts in interrogative clauses. The third section resolves three challenges in implementing this annotation model in XML: (1) pragmatic-discursive operations are expressed at different grammatical levels (lexicon, prosody, syntax, etc.); (2) a linguistic unit can carry different types of pragmatic information as attributes; (3) pragmatic knowledge is not expressed by a closed word class. The fourth section discusses future work and mentions some uses of a corpus tagged with pragmatic knowledge, such as man-machine conversational systems and the teaching of Spanish as a foreign language.
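As a rough illustration of challenges (1) and (2), the sketch below builds XML in which each annotated unit records the grammatical level it belongs to and may carry several kinds of pragmatic information at once. The element name `unit`, the attribute names, and their values are hypothetical, chosen only for illustration; the abstract does not specify the actual PRAGMATEXT tag set.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names: the abstract does not give the
# real PRAGMATEXT schema, so everything below is an illustrative assumption.
def annotate(text, level, **pragmatic):
    """Wrap a transcribed span in a <unit> element.

    `level` records the grammatical level (lexicon, prosody, syntax, ...);
    each keyword argument becomes one attribute of pragmatic information.
    """
    attrs = {"level": level}
    attrs.update(pragmatic)
    unit = ET.Element("unit", attrs)
    unit.text = text
    return unit

utterance = ET.Element("utterance")
# One unit carrying two kinds of pragmatic information (challenge 2).
utterance.append(
    annotate("¿me puedes ayudar?", "syntax",
             speech_act="request", modalization="mitigation")
)
# A unit at a different grammatical level (challenge 1).
utterance.append(annotate("por lo visto", "lexicon", evidentiality="hearsay"))

xml_string = ET.tostring(utterance, encoding="unicode")
print(xml_string)
```

Because attributes are passed as free keyword arguments rather than drawn from a fixed word list, the same helper can tag any span, which loosely mirrors challenge (3): pragmatic knowledge is not tied to a closed word class.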

Related resources

The Assessment of Pragmatic Knowledge in the Online General IELTS-Practice Resources: A Corpus Analysis of Writing Tasks

Motivated by the concept of Communicative Language Ability and the eminence of the IELTS exam, this study intended to scrutinize the representation of functional knowledge (FK) and socio-linguistic knowledge (SK) as sub-components of pragmatic knowledge in the writing performances of both tasks of the online General IELTS-practice resources across three band scores. This quantitative inter-scor...

The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous Speech for Romance Languages

The C-ORAL-ROM project has delivered a multilingual corpus of spontaneous speech for the main Romance languages (Italian, French, Portuguese and Spanish). The collection aims to represent the variety of speech acts performed in everyday language and to enable the description of prosodic and syntactic structures in the four Romance languages. Sampling criteria are defined in a corpus design sche...

The C-ORAL-ROM Project. New methods for spoken language archives in a multilingual romance corpus

C-ORAL-ROM is a multilingual corpus of spontaneous speech of around 1,200,000 words representing the four main Romance languages: French, Italian, Portuguese and Spanish. The resource will be delivered in standard textual format, aligned to the audio source in a multimedia edition. C-ORAL-ROM aims to ensure at the same time a sufficient representation of spontaneous speech variation in each la...

The C-ORAL-BRASIL I: Reference Corpus for Spoken Brazilian Portuguese

C-ORAL-BRASIL I is a Brazilian Portuguese spontaneous speech corpus compiled following the same architecture adopted by the C-ORAL-ROM resource. The main goal is the documentation of the diaphasic and diastratic variations in Brazilian Portuguese. The diatopic variety represented is that of the metropolitan area of Belo Horizonte, capital city of Minas Gerais. Even though it was not a primary g...

An Annotation Tool for Multimodal Dialogue Corpora using Global Document Annotation

This paper reports on a tool which assists the user in annotating a video corpus and enables the user to search for a semantic or pragmatic structure in a GDA-tagged corpus. Search patterns can be given in XQL format or as a plain phrase. The tool also supports manual generation of a GDA timestamped corpus from a video file. It will be publicly available for academic purposes.

Publication date: 2007